The detection of emphatic words using acoustic and lexical features

نویسندگان

  • Jason M. Brenier
  • Daniel M. Cer
  • Daniel Jurafsky
چکیده

In this study, we describe an automatic detector for prosodically salient or emphasized words in speech. Knowledge of whether a word is emphatic or not could improve Text-to-Speech synthesis as well as spoken language summarization. Previous work on emphasis detection has focused on the automatic recognition of pitch accents. Our model extends earlier research by automatically identifying emphatic pitch accents, a subset of pitch accents that mark special discourse functions with extreme degrees of salience. The overall best performance achieved by our system was 87.8% correct, 8.0% above baseline performance. The results of a feature selection algorithm show that the top-performing features in our models are primarily acoustic measures. Our work identifies important cues for emphasis in speech and shows that it is possible for an automated system to distinguish between two levels of perceived prominence in pitch accents with a high degree of accuracy.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Emphatic Information Extraction from Aligned Acoustic Data and Its Application on Sentence Compression

We introduce a novel method to extract and utilize the semantic information from acoustic data. By automatic Speech-ToText alignment techniques, we are able to detect word-based acoustic durations that can prosodically emphasize specific words in an utterance. We model and analyze the sentencebased emphatic patterns by predicting the emphatic levels using only the lexical features, and demonstr...

متن کامل

رویکردی با ناظر در استخراج واژگان کلیدی اسناد فارسی با استفاده از زنجیره‌های لغوی

Keywords are the main focal points of interest within a text, which intends to represent the principal concepts outlined in the document. Determining the keywords using traditional methods is a time consuming process and requires specialized knowledge of the subject. For the purposes of indexing the vast expanse of electronic documents, it is important to automate the keyword extraction task. S...

متن کامل

مدل‌سازی بازشناسی واجی کلمات فارسی

Abstract of spoken word recognition is proposed. This model is particularly concerned with extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...

متن کامل

Developing a Semantic Similarity Judgment Test for Persian Action Verbs and Non-action Nouns in Patients With Brain Injury and Determining its Content Validity

Objective: Brain trauma evidences suggest that the two grammatical categories of noun and verb are processed in different regions of the brain due to differences in the complexity of grammatical and semantic information processing. Studies have shown that the verbs belonging to different semantic categories lead to neural activity in different areas of the brain, and action verb processing is r...

متن کامل

Design and implementation of Persian spelling detection and correction system based on Semantic

Persian Language has a special feature (grapheme, homophone, and multi-shape clinging characters) in electronic devices. Furthermore, design and implementation of NLP tools for Persian are more challenging than other languages (e.g. English or German). Spelling tools are used widely for editing user texts like emails and text in editors.  Also developing Persian tools will provide Persian progr...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005